Memory-Efficient GroupBy-Aggregate using Compressed Buffer Trees

Abstract

Memory is rapidly becoming a precious resource in many data processing environments. This paper introduces a new data structure called a Compressed Buffer Tree (CBT). Using a combination of buffering, compression, and lazy aggregation, CBTs can improve the memory efficiency of the GroupBy-Aggregate abstraction, which forms the basis of many data processing models such as MapReduce and databases. We evaluate CBTs in the context of MapReduce aggregation and show that CBTs can provide significant advantages over existing hash-based aggregation techniques: up to 2× less memory and 1.5× the throughput, at the cost of 2.5× the CPU usage.
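The abstract names three ingredients: buffering, compression, and lazy aggregation. The sketch below is not the CBT from the paper, whose tree layout is not described here. It is a toy, single-level aggregator that only illustrates how those three ideas can combine for a GroupBy-Aggregate such as word count. The class name `LazyBufferedAggregator`, the `buffer_limit` parameter, and the choice of `zlib` plus `pickle` are all assumptions made for illustration.

```python
import pickle
import zlib
from collections import defaultdict


class LazyBufferedAggregator:
    """Toy illustration only; NOT the Compressed Buffer Tree itself."""

    def __init__(self, buffer_limit=4):
        self.buffer_limit = buffer_limit  # hypothetical tuning knob
        self.buffer = []                  # uncompressed (key, value) pairs
        self.compressed_runs = []         # compressed, partially aggregated runs

    def insert(self, key, value):
        # Buffering: appending is cheap; no per-key lookup happens here.
        self.buffer.append((key, value))
        if len(self.buffer) >= self.buffer_limit:
            self._spill()

    def _spill(self):
        # Partially aggregate the buffer, then keep only a compressed copy.
        partial = defaultdict(int)
        for key, value in self.buffer:
            partial[key] += value
        blob = zlib.compress(pickle.dumps(sorted(partial.items())))
        self.compressed_runs.append(blob)
        self.buffer = []

    def finalize(self):
        # Lazy aggregation: final per-key totals are computed only on demand.
        if self.buffer:
            self._spill()
        totals = defaultdict(int)
        for blob in self.compressed_runs:
            for key, value in pickle.loads(zlib.decompress(blob)):
                totals[key] += value
        return dict(totals)


def word_count(words, buffer_limit=4):
    """GroupBy word, aggregate with sum."""
    agg = LazyBufferedAggregator(buffer_limit=buffer_limit)
    for word in words:
        agg.insert(word, 1)
    return agg.finalize()
```

For example, `word_count(["a", "b", "a", "c", "b", "a"], buffer_limit=2)` spills three compressed runs and merges them only when `finalize()` is called. The sketch defers work and stores intermediate state compressed, which mirrors the memory-for-CPU trade-off the abstract reports, but it makes no claim about the paper's actual design or measurements.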

Similar resources

PowerQ: An Interactive Keyword Search Engine for Aggregate Queries on Relational Databases

Keyword search over relational databases has gained popularity due to its ease of use. Current research has focused on the efficient computation of results from multiple tuples, and largely ignores queries to retrieve statistical information from databases. The work in [5] developed a system that allows aggregate functions to be expressed using simple keywords. However, this system may return i...

Ultra High Speed Packet Buffering using “Parallel Packet Buffer”

Modern switches and routers often use dynamic RAM (DRAM) in order to provide large buffer storage space. However, the effective bandwidth of DRAM is frequently a limiting factor in the design of high-speed switches and routers. The focus of this paper is to introduce a packet-buffering architecture called the parallel packet buffering (PPB), which increases the effective memory bandwidth signif...

Elf: Efficient lightweight fast stream processing at scale

Stream processing has become a key means for gaining rapid insights from webserver-captured data. Challenges include how to scale to numerous, concurrently running streaming jobs, to coordinate across those jobs to share insights, to make online changes to job functions to adapt to new requirements or data characteristics, and for each job, to efficiently operate over different time windows. Th...

Fast, Small and Exact: Infinite-order Language Modelling with Compressed Suffix Trees

Efficient methods for storing and querying are critical for scaling high-order m-gram language models to large corpora. We propose a language model based on compressed suffix trees, a representation that is highly compact and can be easily held in memory, while supporting queries needed in computing language model probabilities on-the-fly. We present several optimisations which improve query ru...

Answering Keyword Queries involving Aggregates and GROUPBY on Relational Databases

Keyword search over relational databases has gained popularity as it provides a user-friendly way to explore structured data. Current research in keyword search has largely ignored queries to retrieve statistical information from the database. The work in [13] extends keywords by supporting aggregate functions in their SQAK system. However, SQAK does not consider the semantics of objects and re...

Publication date: 2012